Introducing Tools for Masking Sensitive Information in HAR Files
Hello, this is Makoto Nakamura from Annotation.
Today, I would like to introduce tools for masking sensitive information in HAR files.
Precaution
Please note that there is no guarantee that all sensitive information will be removed using the tool introduced in this article.
Be sure to check that no sensitive information remains before providing any information.
Tool Introduction
The tools are linked in the References section at the end of this article; details follow below.
What is a HAR File?
As mentioned in our company blog, HTTP Archive (HAR) files are JSON files that record the latest network activity in a browser.
Since there are cases where obtaining and sharing HAR files is necessary when contacting AWS Support, they are also introduced in the AWS Knowledge Center.
HAR files, which are necessary for troubleshooting, may sometimes capture sensitive information such as passwords and cookies.
Both our technical support and AWS Support ask customers to mask any sensitive information before sharing HAR files.
Once again: remove any confidential information from HAR files before you share them.
Bulk replacement can be challenging.
HAR files may contain numerous cookies and other sensitive information, and manually masking all of this information can be time-consuming.
Additionally, since the sensitive information in the HAR file's JSON may not follow a consistent format, as shown below, it can be challenging to perform replacements using a text editor.
{
  "name": "cookie",
  "value": "xxxxx"
}
{
  "cookies": [
    {
      "name": "cookie-name",
      "value": "xxxx",
      "path": "/",
      "domain": "example.com",
      "expires": "yyyy-mm-ddThh:mm:ss.000Z",
      "httpOnly": true,
      "secure": true,
      "sameSite": "None"
    }
  ]
}
As an experiment, I obtained a HAR file from the AWS Management Console and searched for "cookie" in a text editor, resulting in 1,500 hits.
Manually masking all of these would be highly inefficient.
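To illustrate why a plain text search is a poor fit, here is a minimal Python sketch that counts cookie data structurally, covering both shapes shown above. The function name `count_cookie_fields` and the sample data are hypothetical and only serve as an illustration; this is not how any of the tools in this article work internally.

```python
import json

def count_cookie_fields(node):
    """Recursively count cookie data in the two shapes seen in HAR JSON.

    Returns (header_pairs, cookie_objects):
      header_pairs   - {"name": "cookie", "value": ...} style entries
      cookie_objects - items inside a "cookies": [...] array
    """
    header_pairs = 0
    cookie_objects = 0
    if isinstance(node, dict):
        name = node.get("name")
        if isinstance(name, str) and name.lower() in ("cookie", "set-cookie") and "value" in node:
            header_pairs += 1
        cookies = node.get("cookies")
        if isinstance(cookies, list):
            cookie_objects += len(cookies)
        for child in node.values():
            h, c = count_cookie_fields(child)
            header_pairs += h
            cookie_objects += c
    elif isinstance(node, list):
        for child in node:
            h, c = count_cookie_fields(child)
            header_pairs += h
            cookie_objects += c
    return header_pairs, cookie_objects

# Hypothetical sample containing one entry of each shape.
sample = json.loads("""
{"log": {"entries": [{"request": {
  "headers": [{"name": "cookie", "value": "xxxxx"}],
  "cookies": [{"name": "cookie-name", "value": "xxxx"}]
}}]}}
""")
print(count_cookie_fields(sample))  # (1, 1)
```

Because the two shapes sit at different places in the JSON tree, a single find-and-replace pattern in a text editor cannot reliably cover both.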
Of course, it is essential to mask sensitive information for security reasons, but what engineers want to focus on is development and operational tasks, not masking sensitive information.
Therefore, I investigated tools that can mask sensitive information in HAR files in bulk and found three tools, which I will introduce below.
Replacement Tool 1: HAR Sanitizer
The first tool is HAR Sanitizer.
This tool is available in Google's GitHub repository, but it is clearly stated within the repository that it is not an official Google product.
Usage is simple: open the live version linked from the GitHub repository at the following URL:
https://har-sanitizer.appspot.com/
When you access the above URL, the following screen will be displayed.
When you click "LOAD HAR" in the upper right corner and select a HAR file, four options will be displayed: "COOKIES," "HEADERS," "URLQUERY/POSTDATA PARAMS," and "CONTENT MIMETYPES."
For each item, there is a checkbox, and by turning on the checkbox, the corresponding item will be masked.
As a test, use a dummy HAR file, select "All Cookies" under "COOKIES," and download the HAR file.
When you check the values of the cookies in the downloaded HAR file, you will see that they have been masked as shown below.
{
  "cookies": [
    {
      "name": "1P_JAR",
      "value": "[1P_JAR redacted]",
      "path": "/",
      "domain": "example.com",
      "expires": "yyyy-mm-ddThh:mm:ss.000Z",
      "httpOnly": true,
      "secure": true,
      "sameSite": "None"
    }
  ]
}
It appears that the tool replaces each cookie value with a string of the form "[cookie-name redacted]."
Other sensitive information is masked in the same format.
This is quite convenient!
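The replacement format observed above can be imitated in a few lines of Python. This is only a sketch of the observed output, not HAR Sanitizer's actual implementation; the function names `redact_cookies` and `_redact_in_place` are hypothetical.

```python
import copy

def _redact_in_place(node):
    """Walk the JSON tree and overwrite the value of every item in a "cookies" array."""
    if isinstance(node, dict):
        cookies = node.get("cookies")
        if isinstance(cookies, list):
            for cookie in cookies:
                if isinstance(cookie, dict) and "name" in cookie and "value" in cookie:
                    cookie["value"] = f"[{cookie['name']} redacted]"
        for child in node.values():
            _redact_in_place(child)
    elif isinstance(node, list):
        for child in node:
            _redact_in_place(child)

def redact_cookies(har):
    """Return a deep copy of the HAR data with cookie values masked."""
    result = copy.deepcopy(har)
    _redact_in_place(result)
    return result

# Hypothetical input mirroring the example above.
entry = {"cookies": [{"name": "1P_JAR", "value": "secret", "path": "/"}]}
redacted = redact_cookies(entry)
print(redacted["cookies"][0]["value"])  # [1P_JAR redacted]
```

Keeping the cookie name inside the placeholder preserves which cookie was present, which can still matter for troubleshooting, while the value itself is discarded.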
However, a concern with HAR Sanitizer is the risk associated with uploading HAR files containing sensitive information to a website.
If the contents of the HAR file are stolen by a third party, there is a danger of unauthorized access to the AWS environment.
Therefore, I investigated tools that can be executed in a local environment and found the second tool, which I will introduce next.
Replacement Tool 2: harsanitizer-docker
As the name suggests, it is a tool that allows you to use the previously introduced HAR Sanitizer in a Docker environment.
I think this is a GitHub repository from an individual developer, not Google.
However, I have confirmed that it works, so I will introduce it.
As a prerequisite, you need to install Docker, so please refer to the Docker documentation for the installation process.
The contents of harsanitizer-docker are the same as HAR Sanitizer, so all you need to do is follow the README and execute the docker run command.
$ docker run -d -p 8080:8080 scottmcmaster/harsanitizer:1.1
Once the container has started, open http://localhost:8080 in your browser.
HAR Sanitizer is then available in the local environment as well.
After that, follow the same steps as with HAR Sanitizer.
This way, you can mask sensitive information without uploading HAR files to a website.
For reference, here are the steps to launch it in Cloud9.
- Create a Cloud9 EC2 environment.
- In the terminal, run the command docker run -d -p 8080:8080 scottmcmaster/harsanitizer:1.1
- After the command execution is complete, click on "Preview Running Application".
- Although HAR Sanitizer can be used in the preview state, if you want to enlarge the screen, click the icon in the upper right corner to open it in a separate tab.
Replacement Tool 3: cloudflare/har-sanitizer
Cloudflare also provides har-sanitizer on their GitHub.
Here are the necessary steps:
$ git clone https://github.com/cloudflare/har-sanitizer.git
$ cd har-sanitizer
$ npm run dev
If you encounter any errors related to concurrently, please ensure that concurrently is also installed.
$ npm i concurrently
$ npm run dev
If the execution is successful, a local URL will be displayed as shown below. Access this URL:
➜ Local: http://127.0.0.1:3000/
As a side note
I wasn't the only one investigating methods to mask sensitive information in HAR files in bulk.
The Stack Overflow question listed under References also introduces a method to replace the information using commands.
If you prefer using commands, please refer to it.
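As a rough illustration of a script-based approach that runs entirely on your own machine, here is a minimal Python sketch that masks the values of selected headers. It is not the method described on the reference site. The header list in `SENSITIVE_HEADERS`, the function names, and the "[redacted]" placeholder are all assumptions, and the sketch does not touch cookies arrays, query strings, POST data, or response bodies.

```python
import json

# Assumed set of header names to mask; extend it for your own environment.
SENSITIVE_HEADERS = {"cookie", "set-cookie", "authorization"}

def mask_headers(har):
    """Mask sensitive request/response header values in place; return how many were masked."""
    masked = 0
    for entry in har.get("log", {}).get("entries", []):
        for side in ("request", "response"):
            for header in entry.get(side, {}).get("headers", []):
                if header.get("name", "").lower() in SENSITIVE_HEADERS:
                    header["value"] = "[redacted]"
                    masked += 1
    return masked

def mask_file(src_path, dst_path):
    """Read a HAR file, mask sensitive headers, and write the result to a new file."""
    # utf-8-sig also accepts files that start with a byte order mark.
    with open(src_path, encoding="utf-8-sig") as f:
        har = json.load(f)
    count = mask_headers(har)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(har, f, ensure_ascii=False, indent=2)
    return count

# Hypothetical in-memory sample so the sketch can be tried without a real HAR file.
sample_har = {"log": {"entries": [{
    "request": {"headers": [
        {"name": "Cookie", "value": "session=abc"},
        {"name": "Accept", "value": "*/*"},
    ]},
    "response": {"headers": [
        {"name": "Set-Cookie", "value": "session=abc; Path=/"},
    ]},
}]}}
masked_count = mask_headers(sample_har)
print(masked_count)  # 2
```

For a real file, something like `mask_file("input.har", "masked.har")` would write a masked copy (both file names are placeholders). As with the other tools, inspect the output yourself before sharing it.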
Summary
In this article, I introduced three tools for masking sensitive information in HAR files.
Since HAR files can sometimes contain sensitive information unexpectedly, it's important to use masking tools effectively to protect company and personal information.
We hope you find this article helpful.
References
- google/har-sanitizer
- scottmcmaster/harsanitizer-docker: Docker image build for har-sanitizer
- cloudflare/har-sanitizer
- About obtaining HAR files | DevelopersIO
- Create a HAR file and Console logs for an AWS Support case | AWS re:Post
- Guidelines for technical inquiries | AWS Support
- Install | Docker Docs
- Step 1: Create an environment - AWS Cloud9
- Docker tutorial for AWS Cloud9 - AWS Cloud9
- jq - How to remove Cookie value from HAR file before sharing it with another person - Stack Overflow
- Do not include cookies in HAR files [40441005] - Chromium